Unit 8 - Gradient Descent Analysis

Gradient Descent Experiment

Based on Mayo's (2017) article and the gradient descent notebook tutorial, I experimented with different values of learning_rate and iterations to observe their impact on the cost function's convergence.

Experimental Setup

Parameters Used

  • learning_rate = 0.15
  • iterations = 150
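
A minimal sketch of the loop behind these numbers, assuming the notebook fits a line y = m*x + b by batch gradient descent on a halved mean-squared-error cost (a common tutorial convention; the variable names are mine):

```python
# Minimal gradient descent sketch, assuming the notebook's setup:
# fit y = m*x + b by minimizing J = (1/2n) * sum(error^2).

x = [1, 2, 3, 4, 5]
y = [5, 7, 9, 11, 13]          # exactly y = 2x + 3

learning_rate = 0.15
iterations = 150

m, b = 0.0, 0.0                # start from zero parameters
n = len(x)
cost_history = []              # cost per iteration, useful for convergence plots

for _ in range(iterations):
    errors = [(m * xi + b) - yi for xi, yi in zip(x, y)]
    cost_history.append(sum(e ** 2 for e in errors) / (2 * n))
    # Partial derivatives of the halved MSE with respect to m and b
    grad_m = sum(e * xi for e, xi in zip(errors, x)) / n
    grad_b = sum(errors) / n
    m -= learning_rate * grad_m
    b -= learning_rate * grad_b

print(f"m = {m:.3f}, b = {b:.3f}")   # lands near the true m = 2, b = 3
```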

Key Observations

  • The cost dropped significantly during the early iterations, showing that the model was learning efficiently.
  • No oscillation or divergence was observed at 0.15, which means the learning rate was aggressive but still stable.
  • After ~50–60 iterations, improvements in cost became marginal, indicating diminishing returns from additional iterations (the stopping-rule sketch after this list makes this concrete).
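
To make the diminishing-returns point concrete, here is a hypothetical stopping rule (not part of the original notebook) that halts once the per-iteration improvement in cost flattens out:

```python
# Hypothetical stopping rule: halt once the absolute improvement in cost
# per iteration falls below a small tolerance, instead of running all 150.

x, y = [1, 2, 3, 4, 5], [5, 7, 9, 11, 13]
m, b, n = 0.0, 0.0, len(x)
learning_rate, tolerance = 0.15, 1e-3
prev_cost = float("inf")

for step in range(150):
    errors = [(m * xi + b) - yi for xi, yi in zip(x, y)]
    cost = sum(e ** 2 for e in errors) / (2 * n)
    if prev_cost - cost < tolerance:     # improvement has flattened out
        print(f"stopped at iteration {step}, cost = {cost:.6f}")
        break
    prev_cost = cost
    m -= learning_rate * sum(e * xi for e, xi in zip(errors, x)) / n
    b -= learning_rate * sum(errors) / n
```

On this dataset the rule should fire in roughly the same 50–60 iteration range noted above.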

Dataset Analysis

The experiment used a simple dataset [1,2,3,4,5] with outputs [5,7,9,11,13], which represents a perfect linear relation y = 2x + 3. This clean dataset allowed for clear observation of the gradient descent behavior.
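
Concretely, assuming the halved mean-squared-error cost used in many gradient descent tutorials, the objective being minimized is

$$J(m, b) = \frac{1}{2n} \sum_{i=1}^{n} \big((m x_i + b) - y_i\big)^2,$$

and because every point lies exactly on y = 2x + 3, the global minimum is J(2, 3) = 0, with no noise floor to obscure convergence.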

Reflection

  • This experiment reinforced the importance of tuning the learning rate. A smaller value would have required more iterations to reach the same result, while a larger one could have risked overshooting the optimal point; the sweep sketched after this list illustrates all three regimes.
  • It also showed how the nature of the dataset matters. This one is small and perfectly linear, so convergence was smooth. On messier, real-world datasets, careful tuning becomes even more critical.
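
As a rough illustration of those three regimes, here is a hypothetical sweep over learning rates on the same toy dataset (these are not results from the original notebook):

```python
# Hypothetical sweep over learning rates, showing slow convergence,
# good convergence, and overshoot/divergence on the same data.

def gradient_descent(x, y, lr, iterations):
    """Fit y = m*x + b by batch gradient descent on the halved MSE."""
    m, b, n = 0.0, 0.0, len(x)
    for _ in range(iterations):
        errors = [(m * xi + b) - yi for xi, yi in zip(x, y)]
        m -= lr * sum(e * xi for e, xi in zip(errors, x)) / n
        b -= lr * sum(errors) / n
    errors = [(m * xi + b) - yi for xi, yi in zip(x, y)]
    return m, b, sum(e ** 2 for e in errors) / (2 * n)

x, y = [1, 2, 3, 4, 5], [5, 7, 9, 11, 13]
for lr in (0.01, 0.15, 0.20):
    m, b, cost = gradient_descent(x, y, lr, iterations=150)
    print(f"lr={lr:<5} m={m:12.3f} b={b:12.3f} final cost={cost:.6g}")
# Expected pattern: 0.01 is still far from m = 2, b = 3 after 150 steps,
# 0.15 lands close to the optimum, and 0.20 overshoots and blows up.
```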